High-Order Markov Random Fields and Their Applications in Cross-Language Speech Recognition

Authors

Jiang Zhipeng

Huang Chengwei

Abstract

In this paper we study cross-language speech emotion recognition using high-order Markov random fields (MRFs), with a focus on Vietnamese speech emotion recognition. First, we extract basic speech features, including pitch frequency, formant frequency, and short-term intensity. From these low-level descriptors we construct statistical features, including the maximum, minimum, mean, and standard deviation. Second, we adopt high-order MRFs to optimize the cross-language speech emotion model; the dimensional restrictions can be modeled by the MRF. Third, we analyze the efficiency of our emotion recognition system using the Vietnamese and Chinese database. We adopt the dimensional emotion model (arousal-valence) to verify the efficiency of the MRF configuration method. The experimental results show that high-order MRFs can improve dimensional emotion recognition in the cross-language experiments, and that the configuration method shows promising robustness across languages.
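The statistical feature step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the descriptor names, the sample values, the use of the population standard deviation, and the convention of marking frames without a value (for example unvoiced pitch frames) as `None` are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def functionals(contour):
    """Summarize one frame-level contour with four utterance-level statistics."""
    # Assumption: frames without a value (e.g. unvoiced pitch) are stored as None.
    values = [v for v in contour if v is not None]
    if not values:
        raise ValueError("contour has no valid frames")
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation (an assumption)
    }


def utterance_features(descriptors):
    """Flatten the functionals of every descriptor into one feature dict."""
    features = {}
    for name, contour in descriptors.items():
        for stat, value in functionals(contour).items():
            features[f"{name}_{stat}"] = value
    return features


# Hypothetical frame-level values for one utterance (not data from the paper).
example = {
    "pitch_hz": [210.0, None, 230.0, 250.0],
    "intensity_db": [60.0, 62.0, 64.0, 66.0],
}
feats = utterance_features(example)
```

Each low-level descriptor contributes four values, so the two contours above produce an eight-dimensional feature vector for the utterance.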


Similar resources

Hidden Markov Random Fields

A noninvertible function of a first-order Markov process, or of a nearest-neighbor Markov random field, is called a hidden Markov model. Hidden Markov models are generally not Markovian. In fact, they may have complex and long-range interactions, which is largely the reason for their utility. Applications include signal and image processing, speech recognition, and biological modeling. We show t...


Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...


Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy continues to improve, there is a considerable gap between their accuracy and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...


A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...


Pragmalinguistic and Sociopragmatic Recognition of High and Low Level EFL Learners

This study investigated the effects of English as a foreign language (EFL) proficiency on what the authors of this study called pragmalinguistic and sociopragmatic recognition of EFL learners. To elicit the data, the study used two types of pragmatic measures: a pragmalinguistic recognition (PLR) test and a sociopragmatic recognition (SPR) test. Both tests were developed by the researchers of thi...




Publication date: 2016
